Online Learning with Transductive Regret
نویسندگان
چکیده
We study online learning with the general notion of transductive regret, that is regret with modification rules applying to expert sequences (as opposed to single experts) that are representable by weighted finite-state transducers. We show how transductive regret generalizes existing notions of regret, including: (1) external regret; (2) internal regret; (3) swap regret; and (4) conditional swap regret. We present a general and efficient online learning algorithm for minimizing transductive regret. We further extend that to design efficient algorithms for the time-selection and sleeping expert settings. A by-product of our study is an algorithm for swap regret, which, under mild assumptions, is more efficient than existing ones, and a substantially more efficient algorithm for time selection swap regret.
منابع مشابه
Efficient Algorithms for Adversarial Contextual Learning
We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem. In this problem, the learner repeatedly makes an action on the basis of a context and receives reward for the chosen action, with the goal of achieving reward competitive with a large class of policies. We analyze two settings: i) in the transductive setting the learner k...
متن کاملWithout-Replacement Sampling for Stochastic Gradient Methods: Convergence Results and Application to Distributed Optimization
Stochastic gradient methods for machine learning and optimization problems are usually analyzed assuming data points are sampled with replacement. In practice, however, sampling without replacement is very common, easier to implement in many cases, and often performs better. In this paper, we provide competitive convergence guarantees for without-replacement sampling, under various scenarios, f...
متن کاملOn Local Regret
Online learning aims to perform nearly as well as the best hypothesis in hindsight. For some hypothesis classes, though, even finding the best hypothesis offline is challenging. In such offline cases, local search techniques are often employed and only local optimality guaranteed. For online decision-making with such hypothesis classes, we introduce local regret, a generalization of regret that...
متن کاملEfficient Transductive Online Learning via Randomized Rounding
Most online algorithms used in machine learning today are based on variants of mirror descent or follow-the-leader. In this paper, we present an online algorithm based on a completely different approach, which combines “random playout” and randomized rounding of loss subgradients. As an application of our approach, we provide the first computationally efficient online algorithm for collaborativ...
متن کاملFrom Batch to Transductive Online Learning
It is well-known that everything that is learnable in the difficult online setting, where an arbitrary sequences of examples must be labeled one at a time, is also learnable in the batch setting, where examples are drawn independently from a distribution. We show a result in the opposite direction. We give an efficient conversion algorithm from batch to online that is transductive: it uses futu...
متن کامل